Vocal tract normalization based on spectral warping

Authors

  • Wei Wang
  • Stephen A. Zahorian
Abstract

Two techniques for speaker adaptation based on frequency scale modifications are described and evaluated. In the first method, minimum mean square error matching is performed between a spectral template for each speaker and a "typical speaker" spectral template. A single parameter, the warping factor, controls the spectral matching. In the second method, a neural network classifier is used to adjust each speaker's frequency warping factor so as to maximize vowel classification performance for that speaker. A vowel classifier trained only with normalized female speech and tested only with normalized male speech, or vice versa, is nearly as accurate as one for which speaker genders are matched for training and testing and the speech is not normalized. The improvement due to normalization is much smaller if training and test data are matched. The normalization based on classification performance is superior to that based on minimizing mean square error.
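The first method can be illustrated with a minimal sketch. The abstract does not give the form of the warping function or the search procedure, so the code below assumes a linear warp of the frequency axis and a simple grid search over candidate factors. The function names `warp_spectrum` and `estimate_warp_factor`, the search range 0.8 to 1.2, and the use of linear interpolation are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def warp_spectrum(spectrum, alpha):
    """Linearly warp the frequency axis (assumed warp form, not from the paper).

    The output at bin k is the input evaluated at bin alpha * k, using linear
    interpolation. Positions past the last bin are clamped to the last bin.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    n = len(spectrum)
    bins = np.arange(n, dtype=float)
    source_positions = np.clip(alpha * bins, 0.0, n - 1.0)
    return np.interp(source_positions, bins, spectrum)


def estimate_warp_factor(speaker_template, typical_template, alphas=None):
    """Grid-search the warping factor that minimizes mean square error.

    Returns (best_alpha, best_mse). The candidate grid is a hypothetical
    default; a real system would choose its own range and resolution.
    """
    typical_template = np.asarray(typical_template, dtype=float)
    if alphas is None:
        alphas = np.linspace(0.8, 1.2, 41)
    errors = [
        np.mean((warp_spectrum(speaker_template, a) - typical_template) ** 2)
        for a in alphas
    ]
    best = int(np.argmin(errors))
    return float(alphas[best]), float(errors[best])
```

Calling `estimate_warp_factor(speaker_template, typical_template)` with two equal-length spectral templates returns the candidate factor with the lowest mean square error and that error. The second method in the abstract would replace the mean square error criterion with classifier accuracy, which this sketch does not attempt.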

Similar resources

Speaker normalization based on test to reference speaker mapping

The paper presents the speaker normalization technique we implemented in a teaching and training system for hearing-impaired children, with the goal of reducing inter-speaker variability in the time-frequency speech representation. In an effort to reduce variance caused by variation in vocal tract shape among speakers, a formant based nonlinear frequency warping approach to vocal tract normalizatio...

Knowledge-Rich Model Transformations for Speaker Normalization in Speech Recognition

In this work we extend the test utterance adaptation technique used in vocal tract length normalization to a larger number of speaker characteristic features. We perform partially joint estimation of four features: the VTLN warping factor, the corner position of the piece-wise linear warping function, spectral tilt in voiced segments, and model variance scaling. In experiments on the Swedish PF...

Investigating Explicit Model Transformations for Speaker Normalization

In this work we extend the test utterance adaptation technique used in vocal tract length normalization to a larger number of speaker characteristic features. We perform partially joint estimation of four features: the VTLN warping factor, the corner position of the piece-wise linear warping function, spectral tilt in voiced segments, and model variance scaling. In experiments on the Swedish PF...

Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by variations in speakers, audio channels and environmental conditions. Making these systems robust to such variations is still a major challenge. One of the main sources of variation among speakers is the difference in their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...

Region-based vocal tract length normalization for ASR

In this paper, we propose a Region-based multi-parametric Vocal Tract Length Normalization (R-VTLN) algorithm for the problem of automatic speech recognition (ASR). The proposed algorithm extends the well-established mono-parametric utterance-based VTLN algorithm of Lee and Rose [1] by dividing the speech frames of a test utterance into regions and by warping independently the features correspo...

Publication date: 2004